Abstract

Reacting against the limitation of statistics to decision procedures, R. A. Fisher proposed 
for inductive reasoning the use of the fiducial distribution, a parameter-space distribution of 
epistemological probability transferred directly from limiting relative frequencies rather than 
computed according to the Bayes update rule. The proposal is developed as follows using the 
confidence measure of a scalar parameter of interest. (With the restriction to one-dimensional 
parameter space, a confidence measure is essentially a fiducial probability distribution free of 
complications involving ancillary statistics.) 

A betting game establishes a sense in which confidence measures are the only reliable inferential probability distributions. The equality between the probabilities encoded in a confidence measure and the coverage rates of the corresponding confidence intervals ensures that the measure's rule for assigning confidence levels to hypotheses is uniquely minimax in the game.

Although a confidence measure can be computed without any prior distribution, previous knowledge can be incorporated into confidence-based reasoning. To adjust a p-value or confidence interval for prior information, the confidence measure from the observed data can be combined with one or more independent confidence measures representing previous agent opinion. (The former confidence measure may correspond to a posterior distribution with frequentist matching of coverage probabilities.) The representation of subjective knowledge in terms of confidence measures rather than prior probability distributions preserves approximate frequentist validity.

Keywords: artificial intelligence; betting; coherence; confidence distribution; expert system; foundations of statistics; inductive reasoning; interpretation of probability; machine learning; personal probability; prior elicitation; subjective probability
1 Introduction



Within the history of frequentism, the preeminent precedent for basing inductive inferences directly on hypothesis probabilities is that found in fiducial arguments of the mature Fisher (Barnard, 1987; Zabell, 1992; Edwards, 1992; Efron, 1998; Fraser, 2000), who objected as staunchly against the behavioristic theory of Neyman and Pearson as against the Bayesian theories of Laplace and Jeffreys (Fisher, 1960; 1973). Several interpretations of fiducial inference have been advanced. For example, Hacking (1965) attempted to justify it in terms of a derivation of logical parameter probabilities from statistical coverage rates under a principle of irrelevance defined in terms of conditional independence (Sharma, 1980). Fiducial-like arguments are now employed in the context of functional models (e.g., Kohlas, 2008) and generalized inference (e.g., Hannig, 2009). The context of the present study is that of an intelligent agent formulating a probabilistic level of certainty of a hypothesis on the basis of observed data and possibly on the basis of more subjective information as well.

A significance function is a cumulative distribution function (CDF) of a probability distribution 
called a confidence measure, which is equivalent to a fiducial probability measure if ancillarity 
considerations are neglected. The confidence measure succinctly includes all the information needed 
to compute any confidence intervals or p-values of any null hypotheses for a single scalar parameter
of interest. As will be seen, methodology for combining confidence measures allows the utilization of one or more
subjective probability distributions of that parameter without forfeiting frequentist validity. As in
strict Bayesian inference, each subjective distribution represents the knowledge of some expert or 
other intelligent agent, allowing the principled incorporation of existing information into an analysis 
of observed data. The use of such information in a frequentist framework in the case of a scalar 
parameter of interest is made possible by recent methodology for combining significance functions, originally intended for the meta-analysis of independent data sets (Singh et al., 2005).



1.1 Confidence measures 

An observed sample $x \in \Omega$ of $n$ observations is modeled as a realization of the random quantity $X$ of probability distribution $P_\xi$, where $\xi$ is the value of the parameter of some family of distributions. Letting $\theta = \theta(\xi) \in \Theta$ denote the subparameter of interest and $\gamma = \gamma(\xi) \in \Gamma$ the nuisance subparameter, $(\theta, \gamma)$ is written in place of $\xi$ without loss of generality.

Definition 1 (Significance function). The function

$$F : \Omega \times \Theta \to [0, 1]$$

is a significance function for $\theta = \theta(\xi)$ if

$$F(x, \bullet) = F_x(\bullet) : \Theta \to [0, 1]$$

is a cumulative distribution function (CDF) for all $x \in \Omega$ and if

$$P_{(\theta,\gamma)}\left(F_X(\theta) < \alpha\right) = \alpha \tag{1}$$

for all $\theta \in \Theta$, $\gamma \in \Gamma$, and $\alpha \in [0, 1]$.

The condition of equation (1) says that $F_X(\theta)$ is a pivotal quantity with a uniform distribution on $[0, 1]$.
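The uniformity condition of equation (1) can be checked numerically in a simple case. The sketch below is only an illustration and assumes a model that is not part of the text: $n$ independent normal observations with known standard deviation, so that $F_x(\theta) = \Phi\left((\theta - \bar{x})\sqrt{n}/\sigma\right)$. The function names and the default values of `theta`, `sigma`, and `n` are hypothetical.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def significance(xbar, theta, sigma, n):
    """F_x(theta) for a normal mean with known sigma (assumed illustrative model)."""
    return STD_NORMAL.cdf((theta - xbar) / (sigma / n ** 0.5))

def pivot_frequency(alpha, theta=1.0, sigma=2.0, n=10, reps=20000, seed=0):
    """Monte Carlo estimate of P(F_X(theta) < alpha) when theta is the true value."""
    rng = random.Random(seed)
    se = sigma / n ** 0.5
    hits = 0
    for _ in range(reps):
        xbar = rng.gauss(theta, se)  # draw from the sampling distribution of the mean
        if significance(xbar, theta, sigma, n) < alpha:
            hits += 1
    return hits / reps

for alpha in (0.05, 0.5, 0.9):
    # Each estimated frequency should be close to alpha, as equation (1) requires.
    print(alpha, round(pivot_frequency(alpha), 3))
```

Under this assumed model the estimated frequencies track $\alpha$ up to Monte Carlo error, whatever true value of $\theta$ is supplied.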

The significance function encodes a rich set of confidence intervals, as follows. 

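Lemma 2. If $F$ is a significance function with inverse function $F^{-1} : \Omega \times [0, 1] \to \Theta$, then

$$P_{(\theta,\gamma)}\left(\theta \in \left(F_X^{-1}(\alpha_1), F_X^{-1}(1 - \alpha_2)\right]\right) = 1 - \alpha_1 - \alpha_2 \tag{2}$$

for all $\theta \in \Theta$, $\gamma \in \Gamma$, and $\alpha_1, \alpha_2 \in [0, 1]$ such that $\alpha_1 + \alpha_2 < 1$. Conversely, consider the function $F^{-1} : \Omega \times [0, 1] \to \Theta$ such that $F_x^{-1}$ is an inverse CDF for all $x \in \Omega$. If equation (2) holds for all $\theta \in \Theta$, $\gamma \in \Gamma$, and $\alpha_1, \alpha_2 \in [0, 1]$ such that $\alpha_1 + \alpha_2 < 1$, then $F : \Omega \times \Theta \to [0, 1]$, the inverse of $F^{-1}$, is a significance function.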
The straightforward proof is omitted. The significance function provides standard one- and two-sided p-values for testing the null hypothesis that $\theta = \theta'$ as well as exact $(1 - \alpha_1 - \alpha_2)100\%$ confidence intervals. A test against the alternative hypothesis $\theta > \theta'$, $\theta < \theta'$, or $\theta \ne \theta'$ respectively yields $F_x(\theta')$, $1 - F_x(\theta')$, or $2F_x(\theta') \wedge 2(1 - F_x(\theta'))$ as the p-value (Fraser, 1991; Schweder and Hjort, 2002).
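As a concrete reading of Lemma 2 and of the three p-values just listed, the following sketch evaluates them for an assumed normal-mean model with known standard error, in which $F_x(\theta) = \Phi((\theta - \bar{x})/\mathrm{se})$. The model, the function names, and the numerical inputs are illustrative assumptions rather than anything specified in the text.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def significance(xbar, theta, se):
    """F_x(theta) under the assumed normal-mean model."""
    return STD_NORMAL.cdf((theta - xbar) / se)

def inverse_significance(xbar, level, se):
    """F_x^{-1}(level); requires 0 < level < 1 in this sketch."""
    return xbar + se * STD_NORMAL.inv_cdf(level)

def confidence_interval(xbar, se, alpha1, alpha2):
    """Interval (F_x^{-1}(alpha1), F_x^{-1}(1 - alpha2)] of level 1 - alpha1 - alpha2."""
    return (inverse_significance(xbar, alpha1, se),
            inverse_significance(xbar, 1 - alpha2, se))

def p_values(xbar, se, theta0):
    """p-values for the null theta = theta0 against each alternative."""
    f = significance(xbar, theta0, se)
    return {
        "theta > theta0": f,
        "theta < theta0": 1 - f,
        "theta != theta0": min(2 * f, 2 * (1 - f)),  # the wedge denotes the minimum
    }

print(confidence_interval(xbar=1.2, se=0.5, alpha1=0.025, alpha2=0.025))
print(p_values(xbar=1.2, se=0.5, theta0=0.0))
```

Choosing unequal `alpha1` and `alpha2` gives asymmetric intervals of the same form, which is the sense in which one significance function encodes a whole family of intervals.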



The significance function is used to generate a probability measure of the interest parameter: 

Definition 3 (confidence measure). Consider $F$, a significance function for $\theta$. For all $x \in \Omega$, if $F_x$ is the CDF of a random quantity $\vartheta$ that has some probability distribution $P^x$ on the measurable space $(\Theta, \mathcal{B})$, then $P^x$ is the confidence measure of $\theta$ that corresponds to $F$ given $X = x$.



Efron (1993) dubbed $P^x$ a confidence distribution, the term Schweder and Hjort (2002) and Singh et al. (2005) instead attached to $F$ due to the isomorphism noted below. To avoid confusion between the probability measure and its CDF, $F$ is herein called the significance function, following Fraser (1991). For clarity, the emphasis will be on the probability distribution rather than on the significance function since confidence measures take the place of Bayesian prior and posterior measures. The idea of a confidence distribution goes as far back as Cox (1958), who recommended the simultaneous consideration of confidence intervals for multiple levels of confidence. Polansky (2007), referring to $P^x$ probabilities as attained or observed confidence levels, provides an accessible introduction to the concept and its applications.

The connection between a confidence measure and such confidence intervals is directly made in this statement that the inferential probability that $\vartheta$ is in a particular observed confidence interval is equal to the coverage rate of the random confidence interval that it realizes.

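Lemma 4. Given a random quantity $\vartheta$ that has some confidence measure $P^x$ on $(\Theta, \mathcal{B})$ corresponding to $F$ given $X = x$,

$$1 - \alpha_1 - \alpha_2 = P^x\left(\vartheta \in \left(F_x^{-1}(\alpha_1), F_x^{-1}(1 - \alpha_2)\right]\right) \tag{3}$$

$$= P_{(\theta,\gamma)}\left(\theta \in \left(F_X^{-1}(\alpha_1), F_X^{-1}(1 - \alpha_2)\right]\right) \tag{4}$$

for all $x \in \Omega$, $\theta \in \Theta$, $\gamma \in \Gamma$, and $\alpha_1, \alpha_2 \in [0, 1]$ such that $\alpha_1 + \alpha_2 < 1$.

Proof. Exact frequentist coverage at rate $1 - \alpha_1 - \alpha_2$ follows from Lemma 2. That rate is equal to the parameter-space probability given $X = x$:

$$1 - \alpha_1 - \alpha_2 = F_x\left(F_x^{-1}(1 - \alpha_2)\right) - F_x\left(F_x^{-1}(\alpha_1)\right)$$

$$= P^x\left(\vartheta \le F_x^{-1}(1 - \alpha_2)\right) - P^x\left(\vartheta \le F_x^{-1}(\alpha_1)\right).$$

□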
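The two sides of Lemma 4 can be compared numerically. The sketch below again assumes a normal-mean model with known standard error, for which the confidence measure $P^x$ is the normal distribution centered at the observed mean; that model, the function names, and the chosen inputs are assumptions made only for illustration.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def interval(xbar, se, alpha1, alpha2):
    """(F_x^{-1}(alpha1), F_x^{-1}(1 - alpha2)] under the assumed model."""
    return (xbar + se * STD_NORMAL.inv_cdf(alpha1),
            xbar + se * STD_NORMAL.inv_cdf(1 - alpha2))

def confidence_probability(xbar, se, alpha1, alpha2):
    """Left side, equation (3): P^x assigns this mass to the observed interval."""
    lo, hi = interval(xbar, se, alpha1, alpha2)
    conf = NormalDist(mu=xbar, sigma=se)  # P^x for this model
    return conf.cdf(hi) - conf.cdf(lo)

def coverage_rate(theta, se, alpha1, alpha2, reps=20000, seed=1):
    """Right side, equation (4): frequency with which random intervals cover theta."""
    rng = random.Random(seed)
    covered = 0
    for _ in range(reps):
        lo, hi = interval(rng.gauss(theta, se), se, alpha1, alpha2)
        covered += lo < theta <= hi
    return covered / reps

print(confidence_probability(xbar=0.3, se=0.5, alpha1=0.10, alpha2=0.05))
print(coverage_rate(theta=2.0, se=0.5, alpha1=0.10, alpha2=0.05))
```

The first value equals $1 - \alpha_1 - \alpha_2$ for any observed mean, and the second approximates the same number by simulation for any true $\theta$.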

This result will be generalized to arbitrary confidence sets in Section 2.1.2.

A confidence measure can be constructed from any significance function:

Lemma 5. Given some significance function $F$, there is a random quantity $\vartheta$ of a confidence measure $P^x$ that corresponds to $F$ given $X = x$ such that, for all $\theta \in \Theta$ and $x \in \Omega$,

$$F_x(\theta) = P^x(\vartheta \le \theta). \tag{5}$$

Proof. For all $x \in \Omega$, consider a set function $\tilde{P}^x$ that satisfies $\tilde{P}^x((\theta', \theta'']) = F_x(\theta'') - F_x(\theta')$ for all $\theta', \theta'' \in \Theta$ such that $\theta' < \theta''$. By the Carathéodory extension theorem (e.g., Schervish, 1995, pp. 578-581 or Kallenberg, 2002, pp. 26-27), there is a measure space $(\Theta, \mathcal{B}, P^x)$ such that $P^x(\Theta') = \tilde{P}^x(\Theta')$ for all $\Theta'$ in the domain of $\tilde{P}^x$. Then $P^x$ is a confidence measure corresponding to $F$ given $X = x$ with the random quantity $\vartheta$ taking values in $\Theta$. □

Thus, every significance function evaluated at $X = x$ is isomorphic to a confidence measure.
1.2 Intelligent agents



For the sake of clarity and close contact with actual problems of statistical data analysis, decision-theoretic results will be presented in familiar terms of estimation rather than solely in terms of abstract decision makers. It is nonetheless often expedient to refer to such hypothetical agents to place the present work in context with the literature since many have found it convenient to imagine an ideally information-processing agent such as the robot of Carnap (1971), especially when motivating axiomatic decision theory and its game-theoretic precursors (§2.2.1). While algorithmic agents in artificial intelligence often make real decisions, agents in statistics instead inform a researcher or administrator who will consider the data analysis results and their underlying assumptions when making a decision that cannot be completely automated. To avoid confusion with actual people, pronouns and possessives referring to agents will be neuter.

1.3 Overview 

Section 2 establishes properties of confidence measures that both motivate their use and guide their subjective assignment. Various definitions and lemmas of Sections 1.1 and 2.1 provide a framework for the accounts of coherence in Section 2.2 and for a game-theoretic attribute of the confidence measure that gives precise, general content to the following reasoning. Kempthorne (1976, p. 224) considered fair odds for betting on the hypothesis that an observed confidence interval covers the parameter value to be a function of the rate of frequentist coverage $p$ as if he were using a confidence measure $P^x$, claiming that such a betting strategy would outperform a Bayesian, "coherently wrong" strategy. Heuristically, the thought is that in assessing a fair betting rate, achieving a reported frequency of correct decisions over repeated sampling outweighs the importance of coherence over time; cf. Robins and Wasserman (2000). The rational component of Kempthorne's assertion had been formally specified in terms of minimizing risk under a simple loss function (Cornfield, 1969). That risk is generalized to a risk associated with testing arbitrary hypotheses in Section 2.3, which establishes that the only minimax solutions are confidence measures.

Section 3 turns to the special case of subjective confidence measures as defined in Section 3.1. A strategy of combining confidence measures, including one or more subjective measures, is proposed in Section 3.2. Guidance on the assignment of subjective confidence measures is then given in Section 3.3. Subjective confidence may be assigned to hypotheses (1) indirectly by means of a hypothetical data set on which the agent might rest its opinion, (2) directly on the basis of minimaxity or other frequentist properties of the confidence measure, or (3) indirectly by transforming a Bayesian prior distribution into a confidence measure.

The paper concludes with a brief discussion. 



2 Properties of confidence measures 

Sections 2.2 and 2.3 respectively record coherence and decision-theoretic criteria met by confidence measures using the terminology introduced in Section 2.1.

2.1 Confidence-based estimation 

Section 2.1.2 treats the problem of deriving a subset $\hat{\Theta}(x)$ of the parameter set that has a desired level of confidence or rate of coverage $p$; the set $\hat{\Theta}(x)$ is called a set estimate to distinguish it from a point estimate such as that of Section 2.1.1.

2.1.1 Hypothesis indicator estimation 

The degree to which a hypothesis is considered supported by data is defined as a point estimate of 
the value indicating whether the hypothesis is true: 

Definition 6. A function

$$\hat{1} : \mathcal{B} \times \Omega \to [0, 1] \tag{6}$$

is called an indicator estimator on $\mathcal{B} \times \Omega$. For all $\theta \in \Theta$, $\Theta' \in \mathcal{B}$, and $x \in \Omega$, the value $\hat{1}(\Theta', x)$, hereafter written as $\hat{1}_{\Theta'}(x)$, is an estimate of $1_{\Theta'}(\theta)$.



Remark 7. This agrees with the interpretation of inferential or logical probability as an estimate of the truth value of its hypothesis (e.g., Wilkinson, 1977; Jeffrey, 1986). However, the definition is general enough to include in principle any function from $\mathcal{B} \times \Omega$ to $\mathbb{R}^1$ by use of a monotonic transform to the conventional $[0, 1]$ range.

Evaluating the indicator estimator under squared error loss, Hwang et al. (1992) found that $F_\bullet(\theta')$ and $1 - F_\bullet(\theta')$ are admissible estimators of $1_{(\inf \Theta, \theta')}(\theta)$ and $1_{(\theta', \sup \Theta)}(\theta)$, respectively, in the case of exponential models, with $\theta$ as the location parameter. The resulting squared-error admissibility of $P^\bullet(\vartheta < \theta')$ as an estimator of $1_{(\inf \Theta, \theta')}(\theta)$ is a weak condition satisfied by all generalized Bayes rules (Hwang et al., 1992) regardless of their actual frequentist performance.



2.1.2 Set estimation 

In order to lay the groundwork for the minimax result of Section 2.3, a general set estimator is defined in terms of the general indicator estimator in the same way as confidence intervals are often defined in terms of p-values. Let $\lambda$ denote the Lebesgue measure on $\mathbb{R}^1$ and $\mathcal{B}([0, 1])$ the Borel $\sigma$-field of $[0, 1]$.

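Definition 8. A function $\hat{\Theta} : \mathcal{B}([0, 1]) \times \Omega \to \mathcal{B}$ is a set estimator and, if the map $\hat{\Theta}_\bullet(x) : \mathcal{B}([0, 1]) \to \mathcal{B}$ is bijective for all $x \in \Omega$, then $\hat{\Theta}$ is an invertible set estimator. Further, $\hat{\Theta}$ is the set estimator corresponding to an indicator estimator $\hat{1}$ on $\mathcal{B} \times \Omega$ if $\hat{\Theta}$ is a set estimator and if $\hat{1}_{\hat{\Theta}_B(x)}(x) = \lambda(B)$ for all $x \in \Omega$. Each observed $\hat{\Theta}_B(x)$ is a set estimate; $\lambda(B)$ is the level or nominal probability of a particular set estimator $\hat{\Theta}_B$ with index $B$ in $\mathcal{B}([0, 1])$.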

The confidence coefficient and Bayesian credibility are examples of the level $\lambda(B)$ of a particular set estimator. Each set $B$ in $\mathcal{B}([0, 1])$ is used to index a particular set estimator in order to facilitate working with a comprehensive collection of particular set estimators corresponding to the same indicator estimator. This proves more convenient than indexing particular set estimators with their levels since the same level can correspond to multiple particular set estimators. For example, the lower-tail ($B = [0, 0.95)$), upper-tail ($B = (0.05, 1]$), and central ($B = (0.025, 0.975)$) 95% Bayesian credibility intervals represent three particular set estimators, each of the same level, 95%. Since $B$ is a Borel set and since $\lambda$ is the Lebesgue measure on $\mathbb{R}^1$, less usual indices such as $B = (0.05, 0.10) \cup (0.50, 0.99)$ are also possible.
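To make the indexing by Borel sets concrete, the following sketch builds the set estimate for the less usual index $B = (0.05, 0.10) \cup (0.50, 0.99)$ and compares its level $\lambda(B)$ with a simulated coverage rate. It assumes a normal-mean model with known standard error and takes the set estimate to be the image of $B$ under $F_x^{-1}$, which is one natural choice and not necessarily the only one; boundary technicalities at 0 and 1 are ignored, and all names are hypothetical.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def level(index_set):
    """Lebesgue measure of a finite union of disjoint intervals in (0, 1)."""
    return sum(b - a for a, b in index_set)

def set_estimate(xbar, se, index_set):
    """Image of each index interval (a, b) under F_x^{-1} for the assumed model."""
    return [(xbar + se * STD_NORMAL.inv_cdf(a), xbar + se * STD_NORMAL.inv_cdf(b))
            for a, b in index_set]

def coverage(theta, se, index_set, reps=20000, seed=2):
    """Frequency with which the random set estimate contains the true theta."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        estimate = set_estimate(rng.gauss(theta, se), se, index_set)
        hits += any(lo < theta < hi for lo, hi in estimate)
    return hits / reps

B = [(0.05, 0.10), (0.50, 0.99)]  # endpoints must lie strictly inside (0, 1) here
print(level(B))                   # nominal level of this particular set estimator
print(set_estimate(xbar=0.0, se=1.0, index_set=B))
print(coverage(theta=1.5, se=1.0, index_set=B))
```

The simulated coverage should be close to the level, which anticipates the equality between level and coverage established in the results that follow.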

The following lemma and theorem are also needed for the game-theoretic result of the next 
section. 

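Lemma 9. Suppose there are some significance function $F$ and indicator estimator $\hat{1}$ on $\mathcal{B} \times \Omega$ such that

$$\hat{1}_{(\theta', \theta'']}(x) = F_x(\theta'') - F_x(\theta'), \tag{7}$$

for all $\theta', \theta'' \in \Theta$ such that $\theta' < \theta''$ and for all $x \in \Omega$. If $\hat{\Theta} : \mathcal{B}([0, 1]) \times \Omega \to \mathcal{B}$ is an invertible set estimator corresponding to $\hat{1}$, then

$$\lambda(B) = P_{(\theta,\gamma)}\left(\theta \in \hat{\Theta}_B(X)\right) \tag{8}$$

for all $B \in \mathcal{B}([0, 1])$, $\theta \in \Theta$, and $\gamma \in \Gamma$. Conversely, if there is an invertible set estimator $\hat{\Theta} : \mathcal{B}([0, 1]) \times \Omega \to \mathcal{B}$ corresponding to an indicator estimator $\hat{1}$ on $\mathcal{B} \times \Omega$ such that equation (8) holds for all $B \in \mathcal{B}([0, 1])$, $\theta \in \Theta$, and $\gamma \in \Gamma$, then there is some significance function $F$ such that $\hat{1}$ satisfies equation (7) for all $\theta', \theta'' \in \Theta$ such that $\theta' < \theta''$ and for all $x \in \Omega$.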

Proof. According to Lemma 2, exact coverage (8) holds for every set estimator $\hat{\Theta}_B(X)$ that maps to an interval in $\mathcal{B}$. To prove exact coverage of every set estimator $\hat{\Theta}_B(X)$ that maps to a union of disjoint intervals in $\mathcal{B}$, note that $\hat{\Theta}_{B'}(x)$ is the subset of $\hat{\Theta}_B(x)$ corresponding to subset $B'$ of $B$ for some $x \in \Omega$ according to the invertibility of $\hat{\Theta}$. With $\mathcal{B}([0, 1])'(B)$ denoting the set of all disjoint interval subsets of $B$,

$$P_{(\theta,\gamma)}\left(\theta \in \hat{\Theta}_B(X)\right) = \sum_{B' \in \mathcal{B}([0,1])'(B)} P_{(\theta,\gamma)}\left(\theta \in \hat{\Theta}_{B'}(X)\right) = \sum_{B' \in \mathcal{B}([0,1])'(B)} \lambda(B') = \lambda(B)$$

for all $B \in \mathcal{B}([0, 1])$, $\theta \in \Theta$, and $\gamma \in \Gamma$, thereby proving the first half of the lemma. The converse follows from Lemma 2 and the fact that equation (2) is a special case of equation (8). □

Equation (8) says the level of any particular set estimator is equal to the actual coverage rate of that set estimator. Hence, the probability that $\vartheta$ is in a particular estimated set is equal to the coverage rate of the corresponding set estimator:

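Theorem 10. Suppose there are some indicator estimator $\hat{1}$ on $\mathcal{B} \times \Omega$ and some significance function $F$ such that, for all $\Theta' \in \mathcal{B}$ and $x \in \Omega$,

$$\hat{1}_{\Theta'}(x) = P^x(\vartheta \in \Theta'), \tag{9}$$

where $\vartheta$ is a random quantity of measure $P^x$, the confidence measure of $\theta$ given $X = x$ that corresponds to $F$. Let $\hat{\Theta} : \mathcal{B}([0, 1]) \times \Omega \to \mathcal{B}$ denote any invertible set estimator corresponding to $\hat{1}$. Then

$$P^x\left(\vartheta \in \hat{\Theta}_B(x)\right) = \lambda(B) = P_{(\theta,\gamma)}\left(\theta \in \hat{\Theta}_B(X)\right) \tag{10}$$

for all $x \in \Omega$, $B \in \mathcal{B}([0, 1])$, $\theta \in \Theta$, and $\gamma \in \Gamma$. Conversely, if there is an invertible set estimator $\hat{\Theta} : \mathcal{B}([0, 1]) \times \Omega \to \mathcal{B}$ corresponding to an indicator estimator $\hat{1}$ on $\mathcal{B} \times \Omega$ such that $\lambda(B) = P_{(\theta,\gamma)}\left(\theta \in \hat{\Theta}_B(X)\right)$ for all $B \in \mathcal{B}([0, 1])$, $\theta \in \Theta$, and $\gamma \in \Gamma$, then there is some significance function $F$ and some confidence measure of $\theta$ given $X = x$ that corresponds to $F$ such that equations (9) and (10) hold for all $x \in \Omega$, $B \in \mathcal{B}([0, 1])$, $\theta \in \Theta$, and $\gamma \in \Gamma$.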

Proof. By Lemma 4, equation (10) holds for all interval elements of $\mathcal{B}([0, 1])$. That result for intervals is extended to all unions of disjoint intervals in $\mathcal{B}([0, 1])$ by the invertibility as used in the proof of Lemma 9, thereby proving the first half of the theorem. The converse follows directly from Lemmas 9 and 4. □



Succinctly generalizing equation (10) to $\theta$ as a vector parameter of interest, Polansky (2007, pp. 4-5, 69, 224-227) defined rather than derived $P^x(\Theta')$, the "attained confidence level" of $\theta \in \Theta'$, to be the coverage frequency of a corresponding confidence set $\hat{\Theta}_{p,\omega(p)}(X)$:

$$P^x(\Theta') = p = P_{(\theta,\gamma)}\left(\theta \in \hat{\Theta}_{p,\omega(p)}(X)\right), \tag{11}$$

where the coverage rate $p$ and shape parameter $\omega(p)$ are constrained such that $\hat{\Theta}_{p,\omega(p)}(x) = \Theta'$ for the observed value $x$ of random element $X$, the distribution $P_{(\theta,\gamma)}$ of which is indexed by parameter $(\theta, \gamma)$. The "observed confidence level" (Polansky, 2007) of equation (11) should not be confused with theories of estimating confidence levels (Kiefer, 1977a; Goutis and Casella, 1995).

2.2 Axiomatic coherence 

Each of the next two subsections establishes the coherence of the certainty distribution $P^x$ from a
distinct viewpoint that led to axioms of coherence or rationality. The first perspective is decision- 
theoretic and the second is logic-theoretic. 
2.2.1 Axiomatic decision theory



Precursors to axiomatic decision theory. The use of the certainty distribution for decision making was motivated in Section 2.3 by placing the decision-making agent in the role of a casino that will settle bets at its published betting odds, allowing a gambling opponent to choose hypotheses on which to bet. This represents situations in which an agent must make a definite decision on the basis of limited information, as when it must either accept the hypothesis that the true parameter value is in a pre-specified interval or accept the hypothesis that it is in the complement of that interval.

That is essentially the gambling scenario for which Ramsey and de Finetti considered this Dutch book situation: a gambler can contract bets with any casino agent that assesses betting odds for certain events in violation of probability theory such that the agent will lose regardless of the outcomes (Gillies, 2000, pp. 59-65). An agent or indicator estimator $\hat{1}$ is called coherent if it assigns betting odds in such a way that it will not suffer such sure loss. Schervish (1995) presents the equivalent mathematical definition of coherence to which the following proposition refers.

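Proposition 11. Let $\mathcal{M}$ be the collection of all measurable maps from a measurable space $(\Omega, \Sigma)$ to $(\Theta, \mathcal{B})$. An indicator estimator $\hat{1}$ on $\mathcal{B} \times \Omega$ is coherent if and only if there is a probability measure $P$ on $(\Omega, \Sigma)$ such that $\hat{1}_{\Theta'}(x) = P(\vartheta \in \Theta')$ for all $\vartheta \in \mathcal{M}$ and $\Theta' \in \mathcal{B}$.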



Proof. This follows immediately from Schervish (1995, Theorem B.139), who uses notation and terminology closer to that of de Finetti (197…). □
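A minimal numerical illustration of the Dutch book situation may help fix ideas. The sketch below is not taken from the cited sources: it assumes an agent that posts prices for two unit-stake tickets, one paying 1 if an event A occurs and one paying 1 otherwise, and that will buy or sell either ticket at its posted price. The function name and the example prices are hypothetical.

```python
import math

def agent_net_payoffs(price_a, price_not_a):
    """Agent's net gain in each outcome when a gambler exploits the posted prices.

    price_a and price_not_a are the agent's prices for tickets paying 1 if A
    (respectively not-A) occurs; the agent buys or sells at those prices.
    """
    total = price_a + price_not_a
    if math.isclose(total, 1.0):
        gain = 0.0  # prices are additive: this pair of bets forces no loss
    elif total > 1.0:
        # The gambler sells both tickets to the agent: the agent pays `total`
        # and collects exactly 1 from whichever ticket wins.
        gain = 1.0 - total
    else:
        # The gambler buys both tickets: the agent collects `total` and pays 1.
        gain = total - 1.0
    return {"A": gain, "not A": gain}

print(agent_net_payoffs(0.7, 0.5))  # negative in both outcomes: a sure loss
print(agent_net_payoffs(0.7, 0.3))  # zero in both outcomes
```

Prices that violate additivity lose the same amount whichever outcome occurs, which is the sure loss that coherence rules out; the proposition above concerns the general version of this requirement.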



The definition of conditional probability has been recovered by a similar theorem based on bets that are called off if some event does not occur; see, e.g., Schervish (1995, pp. 657-658) or Hacking (2001). In an idealized framework, setting conditional betting rates by any parameter distribution other than a conditional probability distribution leads to certain loss (Freedman and Purves, 1969; Cornfield, 1969; Buehler, 1977; Heath and Sudderth, 1978; 1989). Since the probabilities in the theorems provided have no time dependence, they do not indicate the method of replacing a parameter distribution after new data are observed and thus are compatible with the proposed method of replacement by maintaining correct confidence interval coverage rates (§2.3.2). In Bayesian inference, on the other hand, the parameter distribution used to place bets after observing data is identified with the prior distribution conditional on the observed data. Such identification is an assumption that is usually hidden, not a consequence of coherence (Bickel, 2009).

There are known problems with resting coherence on Dutch book theorems alone (Levi, 2002; Howson, 2009). De Finetti admitted that arguments from betting behavior do not provide an unobjectionable foundation for coherent decision making (Gillies, 2000). Ramsey also looked beyond the Dutch book argument, speculating that an axiomatic foundation encompassing both utility and probability could be laid (French, 2000, p. 30). Savage (1954) proved the conjecture by drawing on the game theory conceived in mathematical economics, and others have since created generalizations of his axiomatic decision theory (French, 2000).



Axiomatic decision theory proper. Although axiomatic systems of decision theory were developed with subjective probability in mind, nothing in the mathematics prohibits more objective applications by interpreting hypothesis probabilities as indicator estimates rather than as levels of belief. In fact, the axioms only put very weak constraints on rational decision-making that lead to coherently representing unknown values as random quantities without requiring the additional constraints of a prior distribution and the characteristically Bayesian use of conditional probability. In place of the latter constraints, the proposed framework substitutes the requirement that probabilities correspond to frequentist rates of coverage.

While specifying a particular utility function for use with the axioms is inherently subjective, 
it is no more so than specifying a particular loss function for use in classical frequentist decision 
theory or a particular significance or confidence level for use in Neyman-Pearson theory. In order 
to objectively communicate the results of data analysis, probability distributions of parameters can 
be reported without utilities, as is common Bayesian practice. Accordingly, reporting a certainty 
distribution of a parameter allows each agent to supply its own loss function when making decisions 
on the basis of what can be inferred about the parameter value from the available data. 
2.2.2 Axiomatic inductive logic

While the axiomatic decision theories, building on foundations laid by Bayes (Jeffreys, 1948, §1.3), Ramsey, and de Finetti, derive probability from the maximization of expected utility rather than vice versa (§2.2.1), many have questioned the propriety of the order (e.g., Kardaun et al., 2003). That order was reversed by Keynes (1921), Jeffreys (1948), Cox (1946; 1961), Good (1950), and Joyce (1998), who constructed axiomatic formulations of inductive-logical probability on parameter space without relying on betting behavior, expected gain, or other decision-theoretic concepts.

The term logical probability is used here in the broad sense of mathematical probability interpreted according to any axiomatic system that generalizes some logic of deduction. Because such systems have been closely associated with some version of the now discredited principle of insufficient reason (Franklin, 2001; Gillies, 2000, p. 64), the statistical community has not deemed them a practical guide for data analysis. Logical probability may prove more useful in practice when supplemented instead with a frequentist principle such as one of minimizing arbitrary-hypothesis risk (§2.3.2).

The system of Cox (1961), while lacking mathematical rigor, remains highly regarded for its simplicity and for the generality of its assumptions (e.g., Paris, 1994; Franklin, 2001; Van Horn, 2003; Howson, 2009) and continues to convince scientists to express uncertainty probabilistically (e.g., Habeck et al., 2005). Its two axioms may be expressed in the notation of Section 2.3 with the addition of joint and conditional indicator estimators in the second axiom (Cox, 1961, pp. 3-4):

1. $\hat{1}_{\Theta \setminus \Theta'}(x)$ is a smooth function of $\hat{1}_{\Theta'}(x)$ for all $x \in \Omega$ and $\Theta' \subseteq \Theta$.

2. $\hat{1}_{\Theta', \Theta''}(x)$, the estimate of $1_{\Theta'}(\theta) \wedge 1_{\Theta''}(\theta)$, is a smooth function of $\hat{1}_{\Theta'}(x)$ and of $\hat{1}_{\Theta''}(x \mid \theta \in \Theta')$, the conditional estimate of $1_{\Theta''}(\theta)$ given $\theta \in \Theta'$, for all $x \in \Omega$, $\emptyset \subset \Theta' \subseteq \Theta$, and $\Theta'' \subseteq \Theta$.

From more general versions of those stated axioms, a few tacit assumptions, and the rules of classical logic, Cox (1961) proved $\hat{1}$ to be isomorphic to finitely additive probability (Paris, 1994; Van Horn, 2003; Howson, 2009), allowing identification with the certainty distribution as well as with the Bayesian posterior that Cox originally had in mind.



2.3 Game-theoretic interpretation 

2.3.1 Decisions as bets on hypotheses 

Caution is needed when drawing general conclusions from the losses suffered by gambling agents since such conclusions can be sensitive to the rules of the game (Fraser, 1977). Further, some games resemble situations faced in practice better than others. By construction, inference according to the proposed methodology is robust across two games so different that each had been used to argue for an opposite paradigm of statistics:

1. Kempthorne (1976) and Kiefer (1977b) alluded to a game like that of Section 2.3.2 to support Neyman-Pearson statistics;

2. The game of posting fair betting odds for and against every hypothesis in some $\sigma$-field is the foundation of the traditional Dutch-book argument for Bayesian statistics. See J. K. Ghosh (2006, Appendix C) for an accessible summary of Schervish (1995, pp. 654-655), who furnishes a more general theorem.



2.3.2 Arbitrary-hypothesis minimaxity 

While pure Neyman-Pearson inference is optimal under a risk function that in effect imposes an infinite penalty for failing to control a Type I error rate at some specified level, such a risk function does not provide a helpful representation of all situations faced by the statistician. Many situations that call for data-based decisions are better represented by a risk function representing a statistician's necessity to give odds for the hypothesis that an observed confidence interval covers the parameter of interest such that a decision maker can use those odds to safely bet either for or against that hypothesis as directed by an opponent (Cornfield, 1969): this game gives structure to the claims of Kempthorne (1976) and Kiefer (1977b) that were mentioned in Section 2.3.1.

That risk function is extended to accommodate more general hypothesis testing via the following 
zero-sum game played between a statistician and a client. The client will specify a pair of mutually 
exclusive and jointly exhaustive hypotheses to which the statistician must assign betting odds. 
Those odds determine the amount of either a payoff or penalty for the statistician, depending on 
which hypothesis is true. 

This situation is further stylized by representing the decision-making statistician as a casino agent and the opposing client as a gambler at the casino. The statistician applies a comprehensive collection of set estimators to data and, for each level-$p$ set estimate, posts $p/(1 - p)$ as fair betting odds for the event that the set estimate includes $\theta$ to the event that the set estimate does not include $\theta$. The set is comprehensive in the sense that its elements map to all elements of $\mathcal{B}$ for each $x \in \Omega$. In posting fair betting odds, the statistician announces a willingness to commit to paying the client $p$ or less if the set estimate does not include $\theta$ provided that $1 - p$ or more would instead be received from the client if the set estimate includes $\theta$. The statistician also must swap the payment amounts to bet that the set estimate does not include $\theta$ if the client desires. The client only accepts bet proposals at the odds the statistician considers fair, not favorable. Further, knowing the distributions of the set estimators in the statistician's set, the client will not accept unfavorable bets, that is, bets with negative risk to the statistician. The client enforces this by computing $\omega$, the truly fair betting odds as defined by the ratio of the rate at which sets from the statistician cover $\theta$ to the rate of its non-coverage. The client then compares the fair betting odds to $p/(1 - p)$ when deciding whether to accept a bet at odds $p/(1 - p)$. Thus, the statistician only successfully contracts a bet on coverage if $p/(1 - p) > \omega$ or on non-coverage if $p/(1 - p) < \omega$. This contract is concisely represented in terms of loss suffered by the statistician:

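$$L_B(\theta; x) = \begin{cases} p\, 1_{\Theta \setminus \hat{\Theta}_B(x)}(\theta) - (1 - p)\, 1_{\hat{\Theta}_B(x)}(\theta), & p/(1 - p) > \omega(B) \\ (1 - p)\, 1_{\hat{\Theta}_B(x)}(\theta) - p\, 1_{\Theta \setminus \hat{\Theta}_B(x)}(\theta), & p/(1 - p) < \omega(B) \\ 0, & p/(1 - p) = \omega(B), \end{cases}$$

where $B \in \mathcal{B}([0, 1])$, $p = \lambda(B)$, and $\hat{\Theta}$ is an invertible set estimator mapping $\mathcal{B}([0, 1]) \times \Omega$ to $\mathcal{B}$ and corresponding to some indicator estimator $\hat{1}$ on $\mathcal{B} \times \Omega$; the fair betting odds of $\theta \in \hat{\Theta}_B(X)$ to $\theta \notin \hat{\Theta}_B(X)$ are given by

$$\omega(B) = \frac{P_{(\theta,\gamma)}\left(\theta \in \hat{\Theta}_B(X)\right)}{P_{(\theta,\gamma)}\left(\theta \notin \hat{\Theta}_B(X)\right)}, \tag{12}$$

resulting in the risk the statistician assumes by relying on $\hat{1}$ for assessing the odds of an arbitrary hypothesis.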

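Definition 12. Consider $\mathcal{C}(\hat{1})$, the collection of all invertible set estimators each mapping $\mathcal{B}([0, 1]) \times \Omega$ to $\mathcal{B}$ and corresponding to some indicator estimator $\hat{1}$ on $\mathcal{B} \times \Omega$. The arbitrary-hypothesis risk of $\hat{1}$ is

$$R_{(\theta,\gamma)}(\hat{1}) = \max_{\hat{\Theta} \in \mathcal{C}(\hat{1})} \max_{B \in \mathcal{B}([0,1])} E_{(\theta,\gamma)}\left(L_B(\theta; X)\right) \tag{13}$$

for all $\theta \in \Theta$ and $\gamma \in \Gamma$.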

As in Neyman-Pearson testing, the hypotheses to be assessed are arbitrary in the sense that 
they are dictated by the needs of the current application and are thus outside of the agent's control. 
Additional arbitrary hypotheses may also be specified in the future for unforeseen applications. For 
the purpose of defining the risk associated with the indicator estimator 1 used to assess an arbitrary 
hypothesis, the worst-case specification of a hypothesis corresponds to the least-favorable selection 
of the corresponding set estimator $\hat{\Theta}_B$. Derivation of a testing procedure from a set estimator
rather than vice versa is not without precedent (Scheffé, 1977; Liu, 1997; Efron and Tibshirani,
1998; Gleser, 2002).

Lemma 13. The indicator estimator 1 on $\mathcal{B} \times \Omega$ is minimax to arbitrary-hypothesis risk if and
only if there is an invertible set estimator $\hat{\Theta} : \mathcal{B}([0, 1]) \times \Omega \to \mathcal{B}$ corresponding to 1 such that

\[
P_{(\theta,\gamma)}\left(\theta \in \hat{\Theta}_B(X)\right) = \lambda(B) \tag{14}
\]

for all $B \in \mathcal{B}([0, 1])$, $\theta \in \Theta$, and $\gamma \in \Gamma$.

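Proof. An indicator estimator 1 is minimax to arbitrary-hypothesis risk if and only if it minimizes
$\max_{\theta \in \Theta, \gamma \in \Gamma} R_{(\theta,\gamma)}(1)$ (Definition 12). Given a particular set estimator $\hat{\Theta}_B$ for any $B \in \mathcal{B}([0, 1])$,
the odds for $\theta \in \hat{\Theta}_B$ are $\lambda(B) / (1 - \lambda(B))$ as assessed by 1. If equation (14) holds, those odds
are equal to $\omega(B)$, the true odds given by equation (12), and thus $E_{(\theta,\gamma)}\left(\mathcal{L}_B(\theta; X)\right) = 0$ for
all $B \in \mathcal{B}([0, 1])$, $\theta \in \Theta$, and $\gamma \in \Gamma$. Therefore, $\max_{\theta \in \Theta, \gamma \in \Gamma} R_{(\theta,\gamma)}(1) = 0$. But if equation
(14) does not hold, then $\lambda(B) / (1 - \lambda(B)) \neq \omega(B)$. If $\exists B \in \mathcal{B}([0, 1])$, $\theta \in \Theta$, $\gamma \in \Gamma$ such that
$\lambda(B) / (1 - \lambda(B)) > \omega(B)$, then

\[
E_{(\theta,\gamma)}\left(\mathcal{L}_B(\theta; X)\right) = \lambda(B) \, P_{(\theta,\gamma)}\left(\theta \notin \hat{\Theta}_B(X)\right) - (1 - \lambda(B)) \, P_{(\theta,\gamma)}\left(\theta \in \hat{\Theta}_B(X)\right),
\]

\[
\frac{E_{(\theta,\gamma)}\left(\mathcal{L}_B(\theta; X)\right)}{(1 - \lambda(B)) \, P_{(\theta,\gamma)}\left(\theta \notin \hat{\Theta}_B(X)\right)} = \frac{\lambda(B)}{1 - \lambda(B)} - \omega(B) > 0.
\]

Likewise, if $\exists B \in \mathcal{B}([0, 1])$, $\theta \in \Theta$, $\gamma \in \Gamma$ such that $\lambda(B) / (1 - \lambda(B)) < \omega(B)$, then

\[
\frac{E_{(\theta,\gamma)}\left(\mathcal{L}_B(\theta; X)\right)}{(1 - \lambda(B)) \, P_{(\theta,\gamma)}\left(\theta \notin \hat{\Theta}_B(X)\right)} = \omega(B) - \frac{\lambda(B)}{1 - \lambda(B)} > 0
\]

for all $\theta \in \Theta$, $\gamma \in \Gamma$. Both results together indicate that if there is any $B$ in $\mathcal{B}([0, 1])$ such that
equation (14) does not hold for any $\theta \in \Theta$, $\gamma \in \Gamma$, then $\max_{\theta \in \Theta, \gamma \in \Gamma} R_{(\theta,\gamma)}(1) > 0$. □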

That lemma leads to the corollary of Theorem 10 that establishes the unique minimaxity of the
confidence measure.

Corollary 14. The indicator estimator 1 on $\mathcal{B} \times \Omega$ is minimax to arbitrary-hypothesis risk if and
only if there is some significance function $F$ such that, for all $\Theta' \in \mathcal{B}$ and $x \in \Omega$,

\[
1_{\Theta'}(x) = P^x\left(\vartheta \in \Theta'\right), \tag{15}
\]

where $\vartheta$ is a random quantity of law $P^x$, the confidence measure of $\theta$ that corresponds to $F$ given
$X = x$.

Proof. By Lemma 13, this corollary obtains if and only if exact coverage (14) holds for every
particular set estimator $\hat{\Theta}_B$ corresponding to the indicator estimator 1 given by equation (15).
Theorem 10 supplies the necessary and sufficient conditions.

□
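
The exact-coverage condition (14) can be checked numerically in a simple case. The sketch below is only an illustration under assumptions that are not in the text: the data are $n$ independent $N(\theta, 1)$ observations, the significance function is taken to be $F_x(\theta) = \Phi(\sqrt{n}(\theta - \bar{x}))$, and the set estimate for $B$ is $\{\theta' : F_x(\theta') \in B\}$. The names `phi`, `significance`, and `coverage_rate` are hypothetical.

```python
import math
import random


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def significance(theta, xbar, n):
    """Assumed significance function for a normal mean with unit variance."""
    return phi((theta - xbar) * math.sqrt(n))


def coverage_rate(theta, n, intervals, reps=20000, seed=1):
    """Monte Carlo rate at which {t : F_x(t) in B} contains the true theta,
    where B is a union of disjoint subintervals of [0, 1]."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        xbar = rng.gauss(theta, 1.0 / math.sqrt(n))  # sampling law of the mean
        u = significance(theta, xbar, n)
        hits += any(lo <= u <= hi for lo, hi in intervals)
    return hits / reps


B = [(0.0, 0.025), (0.5, 0.9)]           # an arbitrary Borel set in [0, 1]
lebesgue = sum(hi - lo for lo, hi in B)  # lambda(B) = 0.425
print(lebesgue, coverage_rate(theta=1.0, n=5, intervals=B))
```

With these choices the simulated coverage should sit close to $\lambda(B)$ whatever the shape of $B$, which is the sense in which the confidence level assigned to a hypothesis equals the coverage rate of the corresponding set estimator.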

Remark 15. The conditions of Theorem 10 and Corollary 14 include continuous data and exact
satisfaction of the uniformity condition for brevity and clarity. Applications to actual data often
require judicious approximations. For example, a "half-correction" makes the significance function
applicable to discrete data (Schweder and Hjort, 2002). As an alternative to approximate confidence
measures, upper and lower probabilities constituting envelopes of a class of confidence measures
have been proposed (Bickel, 2009) in the spirit of the Dempster-Shafer theory of belief functions
(Dempster, 2008). In that case, methods of combining confidence measures (§3.2; Singh et al.,
2005) would apply separately to each confidence measure of the class.

3 Incorporation of previous information



3.1 Subjective confidence measures 

Consider a set $A$ of agents, a set $T$ of $\Omega$-measurable maps, and a function $t_{\bullet} : A \to T$. The random
quantity $t_a(X)$ represents relevant information accessible to any agent $a \in A$ or what that agent
can recall from previous experience. The information is subjective in the sense that

\[
t_{a'}(x) \neq t_{a''}(x), \quad a', a'' \in A
\]

is possible for the same observed data $x \in \Omega$. Errors in agent perception or memory could be
modeled by treating $t_a$, for each $a \in A$, as the realization of a random function $T_a$, e.g., yielding

\[
T_a(X) = \{X_1 + Y(a), X_2 + Y(a), \ldots, X_n + Y(a)\},
\]

where $Y(a)$ is some random variable independent of $X_j$. Likewise, agents themselves may be
randomly selected from $A$, in which case $A$ represents a random agent.

For generality, $P_{\theta,\gamma}$ will represent a joint probability measure extended from $P_{(\theta,\gamma)}$ such that
$P_{\theta,\gamma}(X = \bullet)$ is the distribution of the data on which each agent indirectly relies, $P_{\theta,\gamma}(A = \bullet)$ is
the distribution of agents, and $P_{\theta,\gamma}(T = \bullet)$ is the distribution of maps from $\Omega$ to $T$. The trivial
assignments $P_{\theta,\gamma}(A = a) = 1$ and $P_{\theta,\gamma}(T = t) = 1$ are important special cases. If there is a
function $G : T \times \Theta \to [0, 1]$ such that

\[
P_{\theta,\gamma}\left(G_{t_a(X)}(\theta) < \alpha\right) = \alpha \tag{16}
\]

for all $\theta \in \Theta$, $\gamma \in \Gamma$, and $\alpha \in [0, 1]$, then $G$ is a significance function by Definition 1. Let $Q^{(x,a,t)}$
denote the confidence measure of $\theta$ that corresponds to $G$ given $(X, A, T) = (x, a, t)$. Because $G$ and
$Q^{(x,a,t)}$ depend on subjective information in the form of $T_a$, they are called a subjective significance
function and a subjective confidence measure, respectively. By contrast, a confidence measure that
does not depend on such subjective information is called objective.
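
The uniformity requirement (16) can be illustrated by simulation under a fully specified, hypothetical noise model that is not part of the text: $X_1, \ldots, X_n$ are independent $N(\theta, 1)$, the agent recalls $X_i + Y(a)$ with a common recall error $Y(a) \sim N(0, \tau^2)$, and $\tau$ is treated as known. The recalled mean is then $N(\theta, 1/n + \tau^2)$, which suggests the candidate $G$ used in this Python sketch; all function names are assumptions.

```python
import math
import random


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def subjective_significance(theta, recalled, tau):
    """Candidate G evaluated at theta from the agent's recalled data t_a(x)."""
    n = len(recalled)
    scale = math.sqrt(1.0 / n + tau * tau)  # sd of the recalled sample mean
    return phi((theta - sum(recalled) / n) / scale)


def rate_below(theta, n, tau, alpha, reps=20000, seed=7):
    """Monte Carlo estimate of P(G(theta) < alpha) at the true theta."""
    rng = random.Random(seed)
    count = 0
    for _ in range(reps):
        shift = rng.gauss(0.0, tau)  # Y(a): one recall error shared by all X_i
        recalled = [rng.gauss(theta, 1.0) + shift for _ in range(n)]
        count += subjective_significance(theta, recalled, tau) < alpha
    return count / reps


for alpha in (0.05, 0.5, 0.9):
    print(alpha, rate_below(theta=2.0, n=4, tau=0.5, alpha=alpha))
```

If the agent's error distribution were misspecified, the estimated rates would drift away from $\alpha$, which is the kind of deviation Section 3.2.2 is concerned with.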



3.2 Updating a subjective confidence measure 

3.2.1 Combining subjective and objective measures 

Subjective Bayesianism makes use of agent knowledge by conditioning the prior distribution of the
parameters on the data. The result is the Bayes posterior distribution of the parameters,
from which a marginal posterior distribution of the parameter of interest may be obtained. The
proposed substitute for this application of Bayes's theorem is the combination of the subjective
confidence measure with the objective confidence measure by generating a new confidence measure.
Due to the isomorphism between confidence measures and significance functions (§2.1), any of
the methods of combining significance functions studied by Singh et al. (2005) is equivalent to
combining their corresponding confidence measures. Thus, I propose that such methods be used
to combine subjective and objective confidence measures by combining their significance functions
into a single significance function corresponding to a combined confidence measure.



Example 16. Singh et al. (2005) proposed a method of significance function combination that
relies on the choice of a continuous cumulative distribution function. Out of several such functions
considered, $DE$, the cumulative distribution function of the double exponential distribution,
performs the best under repeated sampling, especially for small sample sizes (Singh et al., 2005), when
the consideration of agent opinion has the most impact. For this function, the significance
function $F$, and $L$ independent samples $X^{(1)}, X^{(2)}, \ldots, X^{(L)}$ each drawn from $P_{(\theta,\gamma)}$, the combined
significance function $F$ is defined such that

\[
F(\theta) = DE_L\left(DE^{-1}\left(F_{X^{(1)}}(\theta)\right) + DE^{-1}\left(F_{X^{(2)}}(\theta)\right) + \cdots + DE^{-1}\left(F_{X^{(L)}}(\theta)\right)\right) \tag{17}
\]

for all $\theta \in \Theta$, $DE_L(q)$ is the convolution of $L$ copies of $DE(q)$, and

\[
DE^{-1}(p) = \log(2p) \, I_{(0, 1/2]}(p) - \log(2(1 - p)) \, I_{(1/2, 1)}(p),
\]

with $I$ as the indicator function. Singh et al. (2005) found that the convolution may be computed
from $V_L(q)$, a polynomial that satisfies a simple recursive relation, using

\[
DE_L(q) = \left(1 - \tfrac{1}{2} e^{-q} V_L(q)\right) I_{[0, \infty)}(q) + \tfrac{1}{2} e^{q} V_L(-q) \, I_{(-\infty, 0)}(q).
\]

The most commonly used polynomials are $V_1(q) = 1$, $V_2(q) = 1 + q/2$, and $V_3(q) = 1 + (5q + q^2)/8$.
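
A minimal Python sketch of the double-exponential combination rule, restricted to $L \le 3$ because only $V_1$, $V_2$, and $V_3$ are quoted here. It assumes the standard inverse of the double exponential distribution function, $\log(2p)$ for $p \le 1/2$ and $-\log(2(1 - p))$ otherwise; the names `de_inv`, `de_conv`, and `combine` are hypothetical and are not taken from Singh et al. (2005).

```python
import math

# Polynomials V_L for L = 1, 2, 3, as quoted in the text.
V = {
    1: lambda q: 1.0,
    2: lambda q: 1.0 + q / 2.0,
    3: lambda q: 1.0 + (5.0 * q + q * q) / 8.0,
}


def de_inv(p):
    """Quantile function of the standard double exponential distribution."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(2.0 * p) if p <= 0.5 else -math.log(2.0 * (1.0 - p))


def de_conv(q, L):
    """Distribution function of the sum of L standard double exponentials."""
    v = V[L]
    if q >= 0.0:
        return 1.0 - 0.5 * math.exp(-q) * v(q)
    return 0.5 * math.exp(q) * v(-q)


def combine(values):
    """Combine significance-function values F_1(theta), ..., F_L(theta)
    evaluated at one common theta, following the form of equation (17)."""
    return de_conv(sum(de_inv(p) for p in values), len(values))


print(combine([0.3]))       # a single input is returned unchanged
print(combine([0.5, 0.5]))  # two medians combine to the median
print(combine([0.1, 0.2]))  # two small values reinforce each other
```

Applying `combine` over a grid of $\theta$ values traces out the whole combined significance function, from which p-values and confidence intervals can be read off.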

A confidence measure formed by combining both objective and subjective confidence measures 
will be called an agent-updated confidence measure. Such formation will be referred to as agent-based 
confidence updating (ABCU). 

3.2.2 Reducing sensitivity to violations of matching 

Since the distribution of T a is typically unknown, an available subjective distribution is at best 
an approximation to a confidence measure. Combining subjective distributions with objective 
confidence measures can be made more robust to the influence of a subjective distribution's deviation 
from the properties of a confidence measure by treating a subjective distribution as if it were an 
objective confidence measure based on an incorrect model. In effect, a confidence measure derived 
from an incorrect model or from a misleading agent opinion is not a confidence measure of the true 
parameter value, but of some other underlying parameter value. Methods of adaptive confidence- 
measure combination assign weights such that a combination of confidence measures with different 
underlying parameter values asymptotically has the properties of a confidence measure with respect
to the true parameter value (Singh et al., 2005). Thus the value of the asymptotic combined
confidence measure, when evaluated at the true value of the parameter, has the uniform distribution
H|) needed. 

3.2.3 Multiple sampling distributions 

Confidence-measure combination applies to independent samples drawn from different populations as
well as to independent samples drawn from the same population. The following example illustrates
this.

Example 17. A standard common-mean problem involves estimating a mean $\theta$ shared by two
or more normal populations of unknown variances that may differ from one population to
another. Confidence measure combination using equation (17) yielded 95% confidence intervals
with close to 95% coverage for simulated data drawn from two normal populations of $\theta = 1$
for various choices of sample sizes and population variances (Singh et al., 2005). For one such
choice, samples of sizes 3 and 4 were drawn from populations of variance $1$ and $3.5^2$, respectively.
For this illustration, those populations generated the realizations $y_1 = (0.523, 2.460, 1.119)$ and
$y_2 = (0.072, -2.275, -4.554, -0.077)$, two observed samples with means $(\bar{y}_1, \bar{y}_2)$ and standard
deviations $(s_1, s_2)$. Their combined confidence measure $F$ then gives

$F(\theta) = DE_2$

according to equation (17). Since the sample sizes are small, any agent knowledge may play an
influential role in inference about the mean. Given that the subjective distributions on $\theta$ for two
independent agents are $N(0, 3^2)$ and $N(2, 4^2)$, the agent-updated confidence measure gives

$F_a(\theta) = DE_4$,

where $\Phi(q)$ is the standard normal cumulative distribution function. $F_a(\theta)$ is a confidence measure
under the assumption of a matching agent space. The effect of including agent opinion may be
quantified by noting, e.g., the impact on the p-value of $H_0 : \theta_{\text{true}} = -1$ versus $H_a : \theta_{\text{true}} > -1$;
$F(-1) \approx 0.104$, whereas $F_a(-1) \approx 0.049$. As in the case of suitably assigned prior probabilities,
the effect of incorporating agent knowledge disappears as either sample size goes to infinity.
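
The objective part of this example can be retraced numerically. The sketch below assumes, since the displayed formula is incomplete in this copy, that each sample's significance function is the usual Student t pivot $F_y(\theta) = T_{n-1}\left((\theta - \bar{y}) / (s / \sqrt{n})\right)$ and that the two are combined with $DE_2$ as in equation (17). Closed forms of the t distribution function for 2 and 3 degrees of freedom are used to avoid external libraries, and all function names are hypothetical.

```python
import math
import statistics


def t_cdf(t, df):
    """Student t distribution function, closed forms for df = 2 and df = 3."""
    if df == 2:
        return 0.5 + t / (2.0 * math.sqrt(2.0 + t * t))
    if df == 3:
        x = t / math.sqrt(3.0)
        return 0.5 + (x / (1.0 + x * x) + math.atan(x)) / math.pi
    raise ValueError("only df = 2 or df = 3 are implemented in this sketch")


def t_significance(theta, sample):
    """Assumed significance function of a normal mean with unknown variance."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    return t_cdf((theta - statistics.mean(sample)) / se, n - 1)


def de_inv(p):
    """Quantile function of the standard double exponential distribution."""
    return math.log(2.0 * p) if p <= 0.5 else -math.log(2.0 * (1.0 - p))


def de_conv2(q):
    """Distribution function of the sum of two double exponentials (V_2)."""
    if q >= 0.0:
        return 1.0 - 0.5 * math.exp(-q) * (1.0 + q / 2.0)
    return 0.5 * math.exp(q) * (1.0 - q / 2.0)


y1 = [0.523, 2.460, 1.119]
y2 = [0.072, -2.275, -4.554, -0.077]
theta0 = -1.0
combined = de_conv2(de_inv(t_significance(theta0, y1))
                    + de_inv(t_significance(theta0, y2)))
print(round(combined, 3))
```

Under these assumptions the combined value at $\theta = -1$ comes out near the 0.104 reported above. The agent-updated value is not retraced here, because the form in which the two agent distributions enter the combination cannot be recovered from the text.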

Writing from a Bayesian perspective, Seidenfeld (2007) challenged frequentism with a common-
mean problem in the guise of inferring the mass of a hollow cube from a direct measurement of
the mass of the cube and a measurement of the mass of a ruler of known density. In what is arguably
the most mature logical system of direct inference, the consideration of both measurements leads
to a loss of precision compared to consideration of either measurement alone (Kyburg, 2007). By
contrast, confidence-measure combination enables the effective use not only of information in both
measurements but also of any subjective information, as in Example 17.



3.3 Assignment of subjective confidence 

This section outlines three general methods of assigning a subjective confidence measure. 

However, it is not always necessary to obtain an entire confidence measure. For example, one 
subjective p-value suffices to obtain the agent-based p-value with respect to a single null hypothesis. 



3.3.1 Hypothetical-data confidence assignment 

One way to assign a Bayesian prior is to have the agent generate a hypothetical data set on which
its knowledge about $\theta$ is based; see, e.g., Lele (2004). Likewise, using equation (16), a subjective
confidence measure $Q^{(x,a,t)}$ can be derived from the hypothetical data $x$ on which agent $a$ might have
based its opinion. For that calculation, $t$ either may be an identity function such that $t(x) = x$ and
$P_{\theta,\gamma}(T = t) = 1$ or may calibrate the data assignment to correct an elicitation bias. The technique
will be illustrated with a simple example of generating hypothetical data to generate a subjective
likelihood ratio.



Example 18. Edwards (1992), making use of the likelihood principle without relying on prior
distributions, suggested the use of subjective likelihood ratios under the paradigm in which inference is
made directly from the actual likelihood function rather than from a posterior distribution derived
from it or from likelihood functions based on unobserved data. His Example 3.5.1 supposes the
knowledge of Torricelli regarding the atmospheric pressure $\mu$, prior to his taking measurements, is
equivalent to the knowledge that would be gained by drawing 740 mmHg from a normal
distribution of unknown mean $\mu$ and known standard deviation 25 mmHg. Then, up to a proportionality
constant, the subjective likelihood is $\exp\left(-\frac{1}{2}\left(\frac{740 - \mu}{25}\right)^2\right)$, whereas the likelihood from his
measurement, 760 mmHg, is $\exp\left(-\frac{1}{2}\left(\frac{760 - \mu}{1}\right)^2\right)$ since the sampling distribution is assumed normal with
standard deviation 1 mmHg. Multiplying the two likelihood functions yields the likelihood function
used for inference; its maximum occurs at $\mu = 759.968$ mmHg (Edwards, 1992).
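
The maximum quoted in Example 18 follows from the fact that a product of normal likelihoods in $\mu$ is maximized at the precision-weighted mean of the observations. A short Python check, with the hypothetical helper name `product_likelihood_max`:

```python
def product_likelihood_max(observations):
    """Maximizer of a product of normal likelihoods in the mean.

    `observations` is a list of (value, standard deviation) pairs; the
    maximizer is the mean of the values weighted by 1 / sd**2.
    """
    weights = [1.0 / sd ** 2 for _, sd in observations]
    weighted_sum = sum(w * value for w, (value, _) in zip(weights, observations))
    return weighted_sum / sum(weights)


# Subjective datum 740 mmHg (sd 25) and measurement 760 mmHg (sd 1).
mu_hat = product_likelihood_max([(740.0, 25.0), (760.0, 1.0)])
print(round(mu_hat, 3))  # 759.968
```

The measurement dominates because its precision is 625 times that of the subjective datum.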



The same method can generate a subjective confidence measure. 

Example 19. In the problem of Example 18, the subjective and objective likelihood functions are
proportional to the densities of the subjective and objective confidence measures if $P_{\theta,\gamma}(A = a) = 1$
and $P_{\theta,\gamma}(T = t) = 1$, where $t$ is an identity function. Using equation (17), the density of the
agent-updated confidence measure reaches its maximum at $\mu = 759.231$ mmHg.



3.3.2 Direct confidence assignment 

Like Bruno de Finetti's prior probability of a hypothesis, the subjective confidence of a hypothesis 
can be defined as the perceived value of the opportunity to gain one unit of utility if the hypothesis 
is true. In other words, both the Bayesian subjective probability and the subjective confidence level 
are betting quotients that determine the decisions of some agent. 

While Bayesian prior probability is constrained only by coherence, confidence levels are also
constrained by the rules of the game described in Section 2.3, modified as follows. In reporting set
estimators to the client casino, the statistician agent $a$ cannot use the data $X$ but rather reduced
data $t_a(X)$ defined by some $\Omega$-measurable map $t_a$. A set estimator $\hat{\Theta}_B$ is only available to the agent
if there is some function $\hat{\Theta}^{\dagger}_B$ such that

\[
\hat{\Theta}_B(X) = \hat{\Theta}^{\dagger}_B\left(t_a(X)\right) \tag{18}
\]

for all $B \in \mathcal{B}([0, 1])$, $x \in \Omega$. The agent will achieve minimaxity under arbitrary-hypothesis loss if
and only if the set estimates made on the basis of its choices of $\hat{\Theta}^{\dagger}_B$ are isomorphic to confidence
measures. With the set estimators selected by the agent under restriction (18), each confidence
measure $Q^{(x,a,t)}$ is that agent's subjective confidence measure on the basis of information $t(x)$,
where $t = T$ if $T$ is a fixed function or $t$ is a realization of $T$ if $T$ is a random function.

As in Bayesian statistics, subjective distributions need not be constructed by a formal process 
of eliciting the actual beliefs of a human individual or organization. Nonetheless, the definition 
of subjective confidence in terms of an agent's betting rate serves as a guiding principle for the 
specification of subjective confidence measures. 

The foundational idea of this frequentist methodology may be motivated by the corresponding 
Bayesian methodology. From an idealized Bayesian standpoint, inasmuch as an agent's elicited 
opinion has been coherently formulated from observed data such as summaries of data more di- 
rectly observed by others, the agent's prior will equal the posterior distribution that would have 
been obtained by applying Bayes's theorem to the former data were they still available. Similarly, 
since under frequentism a set of p-values or confidence intervals takes the place of the posterior
distribution as the inference result, one may consider eliciting p-values or confidence intervals from
an agent that, in the ideal case, equal those that would have been computed given the now unavailable
data on which the agent opinion was based. Then, in analogy with how the Bayesian combines a
subjective prior distribution with a likelihood function obtained from data, the frequentist may use
confidence-measure combination to combine subjective p-values and confidence intervals with those
obtained from data. The confidence measure provides a framework for such combination by
encapsulating the information of p-values and confidence intervals into a confidence measure. Informally,
given a sample of fixed observations, the cumulative distribution function of the standard confidence 
measure maps each null hypothesis parameter value to its upper-tailed p-value. Likewise, given a 
fixed elicitation of an agent's opinion, the cumulative distribution function of the subjective confi- 
dence measure maps each null hypothesis parameter value to the upper-tailed p-value assigned by 
the agent. Alternatively, the agent's subjective distribution may be approximated by interpolating 
the lower-tailed p-values provided at a sufficient number of null hypothesis parameter values or by 
interpolating the confidence intervals provided at a sufficient number of confidence levels. To the 
extent that each randomly selected agent's knowledge is correctly summarized in such a subjective 
distribution, one-sided p-values derived from those priors follow a uniform distribution under the 
truth of the null hypothesis. 
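
One way to carry out the interpolation just described is sketched below in Python. The elicited pairs are hypothetical, linear interpolation is only one possible choice, and the function name `interpolate_distribution` is an assumption; the sketch treats each elicited p-value as the value of the agent's subjective distribution function at the corresponding null value.

```python
from bisect import bisect_right


def interpolate_distribution(points):
    """Piecewise-linear distribution function through elicited
    (null value, p-value) pairs, held constant outside the elicited range."""
    pts = sorted(points)
    thetas = [theta for theta, _ in pts]
    ps = [p for _, p in pts]
    if any(later < earlier for earlier, later in zip(ps, ps[1:])):
        raise ValueError("elicited p-values must be nondecreasing in theta")

    def cdf(theta):
        if theta <= thetas[0]:
            return ps[0]
        if theta >= thetas[-1]:
            return ps[-1]
        i = bisect_right(thetas, theta)  # thetas[i - 1] <= theta < thetas[i]
        t0, t1 = thetas[i - 1], thetas[i]
        return ps[i - 1] + (ps[i] - ps[i - 1]) * (theta - t0) / (t1 - t0)

    return cdf


# Hypothetical elicitation at three null hypothesis parameter values.
subjective_cdf = interpolate_distribution([(0.0, 0.1), (1.0, 0.5), (2.0, 0.9)])
print(subjective_cdf(0.5), subjective_cdf(1.5))
```

The resulting function can be evaluated at any null value and passed to a combination rule such as equation (17) alongside the significance function computed from data.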



3.3.3 Bayesian confidence assignment 

Due to the novelty of assigning subjective confidence directly, the guidance provided in Section
3.3.2 might be less reliable in the case of a human agent than the application of mature procedures
of eliciting Bayesian prior distributions and of correcting them to ensure coherence. For examples,
see Chaloner (1996), Craig et al. (1998), and Garthwaite et al. (2005).



If a Bayesian prior elicited from an agent equals a Bayesian posterior computed from a probability-
matching initial distribution (Datta and Mukerjee, 2004) and the data on which the agent indirectly
based its prior, then the prior asymptotically approaches a confidence measure; see Singh et al.
(2005) on asymptotic significance functions. In this sense, a Bayesian prior is approximately equal
to a subjective confidence measure suitable for combination with an objective confidence measure.

| | Objective input only | Subjective input included |
|---|---|---|
| Frequentist calculations | Objective confidence measures or combinations thereof are used for inference | Combinations of objective and subjective confidence measures are used for inference |
| Bayesian calculations | Starting with an improper prior distribution, the resulting posterior distribution is used for inference | Starting with an agent's proper prior distribution, the resulting posterior distribution is used for inference |

Table 1: Comparison of subjective Bayesianism and the proposed subjective frequentism (the agent-
updated confidence measure) to their counterparts that do not formally rely on agent knowledge.



4 Discussion 

4.1 Reliability of confidence measures 

The reliability of $P^x$ in the form of confidence matching (Theorem 10) and arbitrary-hypothesis
minimaxity (Corollary 14) means the decision-making agent cannot suffer any expected loss due
to a strategy of placing bets over repeated samples in the above game-theoretic framework. Con- 
sequently, if the game were modified such that at least some of the bets placed are favorable, the 
agent would accrue an expected gain. However, the benefit of achieving minimax risk and the stated 
rates of interval coverage is not limited to those rare or non-existent situations in which more than 
one sample is drawn from the same population; rather, those properties reflect the reliability of 
methods satisfying them. 



4.2 Other approaches to subjective information 
4.2.1 Objective and subjective Bayes 

Considering ABCU as subjective frequentism elucidates its relationship to standard Bayesian
approaches (Table 1). The comparison involves some blurring between Bayesianism and frequentism,
as posterior distributions with probability-matching priors are approximate confidence measures. 
This suggests that the choice of whether or not to incorporate agent knowledge may often tend 
to have more impact on inference than the choice of whether to perform Bayesian or frequentist 
calculations. 

Whereas application of the agent-updated confidence measure is only based on previous infor- 
mation about a one-dimensional parameter of interest, a subjective Bayesian analysis can also make 
use of any information that is also available for the nuisance parameters, often leading to more
reliable inference. However, even when agents do have such information, it is only rarely elicited, and,
in practice, improper priors tend to be put on the vast majority of nuisance parameters (Berger,
2004). That the agent-updated confidence measure requires the elicitation of information on only
one scalar parameter may enable researchers to incorporate at least some subjective knowledge into
their analyses.



4.2.2 Previous non-Bayesian approaches 

ABCU is not the only non-Bayesian framework available for use of subjective information about
the scalar parameter of interest. Example 18 illustrates such use in the direct-likelihood framework
further developed by Royall (1997). The likelihood principle of that framework is not followed by
Neyman-Pearson uses of subjective likelihood, also called a likelihood penalty (Schweder and Hjort,
2002).

Perhaps those previous non-Bayesian methods of incorporating agent knowledge are seldom
used because they require elicitation of likelihood ratios, which, as Royall (2000) conceded, are
understood by few scientists. If so, then methods requiring only the elicitation of either a p-value
or of a few confidence intervals, as guided by one of the methods of Section 3.3, may have wider
appeal.

Also obviating the elicitation of a likelihood function, Fraser and Reid (2002) consider assigning
a subjective prior to a scalar parameter of interest in the location parameterization. To the extent
that its treatment of nuisance parameters approximates the assignment of diffuse priors to such
parameters, this approach will give results similar to those of the more classical Bayesian approach.
However, the reparameterization approach is better understood on its own terms: applying Bayes's
theorem to a uniform prior in the location parameterization yields a posterior that matches
frequentist confidence intervals (Fraser and Reid, 2002), i.e., the Bayesian posterior is a confidence
measure. That other priors in general bring departures from this matching property distinguishes
this method from ABCU. This does not necessarily indicate the superiority of the latter, as the 
preservation of matching probability even under the use of subjective information comes in exchange 
for inference additivity, an issue addressed below. 

More generally, one may multiply the probability density function of a matching prior by a
likelihood or pseudo-likelihood that is only a function of the scalar interest parameter, yielding,
after normalization, a posterior distribution for inference. Following Schweder and Hjort (2002),
such a likelihood or pseudo-likelihood function is considered reduced since, unlike the full likelihood
function, it does not depend on the nuisance parameters. If this reduced likelihood is proportional
to the density function of the objective confidence measure, then the posterior probability will equal
the normalized product of the density of the subjective confidence measure and the density of the
objective confidence measure. (Schweder and Hjort (2002) found that the proportionality property
holds for some reduced likelihoods.) Under certain conditions, that normalized product
asymptotically approaches the combined confidence measure found by equation (17) with the substitution
of the matching prior distribution for a confidence measure. In general, however, directly using that
equation leads to more exact confidence measures than those obtained by multiplying confidence
measure densities (Singh et al., 2005).

The principle of inference additivity mentioned above means the totality of inferences resulting
from several analyses, each based on different information, is identical to the inference from the
single analysis based on simultaneous consideration of all of the information; cf. Edwards (1992).
The loss of inference additivity is the most obvious drawback of ABCU compared to uses of Bayes's
theorem. In the context of agent-updated confidence measures, the way in which three or more
confidence measures are combined may affect the result, e.g., combining the combination of two
confidence measures with a third yields a confidence measure that is not necessarily equal to that of
the simultaneous combination of all three. (The combination of the objective confidence measures
of Section 3.2.3 with the combination of both subjective confidence measures results in a p-value
of 0.054 instead of the simultaneous-combination p-value of 0.049.)
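
The dependence on the order of combination is easy to see numerically. The Python sketch below uses three hypothetical p-values rather than the quantities of Example 17, together with the polynomials $V_2$ and $V_3$ of the double exponential combination rule; the function names are assumptions.

```python
import math

# Polynomials V_L for the double exponential convolution, L = 1, 2, 3.
V = {
    1: lambda q: 1.0,
    2: lambda q: 1.0 + q / 2.0,
    3: lambda q: 1.0 + (5.0 * q + q * q) / 8.0,
}


def de_inv(p):
    """Quantile function of the standard double exponential distribution."""
    return math.log(2.0 * p) if p <= 0.5 else -math.log(2.0 * (1.0 - p))


def combine(values):
    """Double exponential combination of L <= 3 significance-function values."""
    q = sum(de_inv(p) for p in values)
    v = V[len(values)]
    if q >= 0.0:
        return 1.0 - 0.5 * math.exp(-q) * v(q)
    return 0.5 * math.exp(q) * v(-q)


p1, p2, p3 = 0.1, 0.2, 0.3                   # hypothetical p-values
simultaneous = combine([p1, p2, p3])         # all three at once
sequential = combine([combine([p1, p2]), p3])  # first two, then the third
print(round(simultaneous, 4), round(sequential, 4))
```

For these inputs the two routes give roughly 0.097 and 0.115, so an analysis plan should fix the grouping of confidence measures in advance.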

It has also been noted that inference based on confidence measures, unlike that based on Bayesian
posterior distributions, violates the likelihood principle (Schweder and Hjort, 2002). ABCU shares
this violation with other methods designed to have correct coverage when agent opinion is not
incorporated.

It may be concluded that the optimality of one method over another will depend largely on 
the availability of information from agents and on the relative desirability of each of the inference 
principles and frequentist properties. Both methods using generalized confidence measures share a 
new way to formalize agent knowledge. The unique benefit of ABCU is its production of probability 
statements that are correct in the frequentist sense even after incorporating that knowledge.